AMENDMENTS TO THE CLAIMS

1. (Currently amended) At least one computer readable medium encoded with instructions that,
when executed by at least one processor, perform a [[A]] method for generating speech recognition
models, the method comprising:

[[converting speech spoken from a plurality of female speakers into a female set of recorded
phonemes training data;]]

[[converting speech spoken from a plurality of male speakers into a male set of recorded
phonemes training data;]]

receiving a female speech recognition model of phoneme models based on [[the]] a female set
of recorded phonemes training data;

receiving a male speech recognition model of phoneme models based on [[the]] a male set of
recorded phonemes training data;

determining a difference in model information between pairs of corresponding phoneme
models of the [[first]] female speech recognition model and the [[second]] male speech recognition model;
and

creating a gender-independent speech recognition model that includes a gender-independent
phoneme model based on [[the female set of recorded phonemes training data and the male set of
recorded phonemes training data if]] a pair of corresponding phoneme models of the female speech
recognition model and the male speech recognition model when the difference in model information
between the phoneme models of the pair of corresponding phoneme models is insignificant.

2. (Currently amended) The [[method]] at least one computer readable medium of claim 1, further
comprising removing each of the phoneme models of the pair of corresponding phoneme models
from the female speech recognition model and the male speech recognition model when the
difference in model information between the phoneme models is insignificant [[wherein whether the
model information is insignificant is based on a threshold model quantity]].

3. (Currently amended) The [[method]] at least one computer readable medium of claim 1,
wherein determining the difference in model information includes calculating a Kullback Leibler
distance between the first speech recognition model and second speech recognition model.

4. (Currently amended) The [[method]] at least one computer readable medium of claim 3,
wherein whether the model information is insignificant is based on a threshold Kullback Leibler
distance quantity.

5. (Currently amended) The [[method]] at least one computer readable medium of claim 1,
wherein the female speech recognition model, the male speech recognition model, and the
gender-independent speech recognition model are Gaussian mixture models.

6. (Currently amended) A system for generating speech recognition models, the system 
comprising: 

a computer processor; 

a first speech recognition model of phoneme models based on a first set of training data, the 
first set of training data originating from a first set of common entities; 

a second speech recognition model of phoneme models based on a second set of training 
data, the second set of training data originating from a second set of common entities; and 

a processing module configured to create an independent speech recognition model that
includes an independent phoneme model based on a pair of corresponding phoneme models of the
first speech recognition model and the second speech recognition model when [[the first set of
training data and the second set of training data if]] the difference in model information between the
phoneme models of the pair of corresponding phoneme models [[first speech recognition model and
the second speech recognition model]] is insignificant.

7. (Currently amended) The system of claim 6, wherein the processing module is configured to
remove each of the phoneme models of the pair of corresponding phoneme models from the first
speech recognition model and the second speech recognition model when the difference in model
information between the phoneme models is insignificant [[whether the model information is
insignificant is based on a threshold model quantity]].

8. (Previously presented) The system of claim 6, wherein the processing model is further 
configured to calculate a Kullback Leibler distance between the first speech recognition model and 
second speech recognition model. 

9. (Original) The system of claim 8, wherein whether the model information is insignificant is 
based on a threshold Kullback Leibler distance quantity. 

10. (Currently amended) The system of claim 6, wherein the first speech recognition model, the
second speech recognition model, and the independent speech recognition model are Gaussian 
mixture models. 

11. (Currently amended) A computer program product embodied in computer memory 
comprising: 

computer readable program codes coupled to the computer memory for generating speech 
recognition models, the computer readable program codes configured to cause the program to: 

receive a first speech recognition model of phoneme models based on a first set of training 
data, the first set of training data originating from a first set of common entities; 

receive a second speech recognition model of phoneme models based on a second set of 
training data, the second set of training data originating from a second set of common entities; 

determine a difference in model information between pairs of corresponding phoneme 
models of the first speech recognition model and the second speech recognition model; and 

create an independent speech recognition model that includes an independent phoneme
model based on a pair of corresponding phoneme models of the first speech recognition model and
the second speech recognition model when [[the first set of training data and the second set of training
data if]] the difference in model information between the phoneme models of the pair of
corresponding phoneme models is insignificant.

12. (Currently amended) The computer program product of claim 11, wherein the computer
readable program codes configured to cause the program to remove each of the phoneme models of
the pair of corresponding phoneme models from the first speech recognition model and the second
speech recognition model when the difference in model information between the phoneme models is
insignificant [[whether the model information is insignificant is based on a threshold model quantity]].

13. (Original) The computer program product of claim 11, wherein determining the difference in
model information includes calculating a Kullback Leibler distance between the first model and 
second model. 

14. (Original) The computer program product of claim 13, wherein whether the model 
information is insignificant is based on a threshold Kullback Leibler distance quantity. 

15. (Currently amended) The computer program product of claim 11, wherein the first speech
recognition model, the second speech recognition model, and the independent speech recognition
model [[models]] are Gaussian mixture models.

16. (Currently amended) A system for generating speech recognition models, the method
comprising: 

a computer processor; 

a first speech recognition model of phoneme models based on a first set of training data, the 
first set of training data originating from a first set of common entities; 

a second speech recognition model of phoneme models based on a second set of training 
data, the second set of training data originating from a second set of common entities; and 

means for creating an independent speech recognition model that includes an independent
phoneme model based on a pair of corresponding phoneme models of the first speech recognition
model and the second speech recognition model when [[the first set of training data and the second set
of training data if]] the difference in model information between the phoneme models of the pair of
corresponding phoneme models [[first speech recognition model and the second speech recognition
model]] is insignificant.
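
For illustration only, and not as part of any claim, the comparison recited in claims 1 through 16 might be sketched as follows. The sketch assumes each phoneme model is a single diagonal-covariance Gaussian (the claims recite Gaussian mixture models, for which the Kullback Leibler distance has no closed form), uses a symmetrised distance, and merges an insignificantly different pair by moment matching. The function names, the threshold value, the sample data, and the merge rule are all hypothetical and are not taken from the application.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(p, q):
    """Symmetrised distance KL(p || q) + KL(q || p); p and q are (mean, var) pairs."""
    return kl_diag_gauss(p[0], p[1], q[0], q[1]) + kl_diag_gauss(q[0], q[1], p[0], p[1])


def merge(p, q):
    """Hypothetical merge rule: moment-match an equal-weight mixture of p and q."""
    mean = [(mp + mq) / 2.0 for mp, mq in zip(p[0], q[0])]
    var = [
        (vp + vq) / 2.0 + ((mp - mq) / 2.0) ** 2
        for mp, vp, mq, vq in zip(p[0], p[1], q[0], q[1])
    ]
    return mean, var


def split_models(female, male, threshold):
    """Move phoneme pairs whose distance is below the threshold into a shared model.

    Returns (female_only, male_only, independent). A merged phoneme is removed
    from both gender-specific models, mirroring the removal step of claim 2.
    """
    female_only, male_only, independent = dict(female), dict(male), {}
    for phoneme in female.keys() & male.keys():
        if symmetric_kl(female[phoneme], male[phoneme]) < threshold:
            independent[phoneme] = merge(female[phoneme], male[phoneme])
            del female_only[phoneme]
            del male_only[phoneme]
    return female_only, male_only, independent


# Made-up one-dimensional models: "aa" is nearly identical across genders, "s" is not.
female = {"aa": ([0.0], [1.0]), "s": ([0.0], [1.0])}
male = {"aa": ([0.1], [1.0]), "s": ([3.0], [1.0])}
female_only, male_only, independent = split_models(female, male, threshold=0.5)
```

With these made-up values only "aa" falls below the threshold, so it moves into the shared model while "s" stays in both gender-specific models.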

17. (Currently amended) At least one computer readable medium encoded with instructions 
that, when executed by at least one processor, perform a [[A]] method for recognizing speech from 
an audio stream originating from one of a plurality of data classes, the method comprising: 

[[converting the speech into the audio stream;]]

receiving a current feature vector of the audio stream; 

computing a current vector probability that the current feature vector belongs to one of the 
plurality of data classes; 

computing an accumulated confidence level that the audio stream belongs to one of the
plurality of data classes based on the current vector probability and on previous vector probabilities;

weighing class models based on the accumulated confidence; and

recognizing the current feature vector based on the weighted class models; and

wherein the plurality of data classes include a female speech recognition model based on
recorded phonemes originating from a plurality of female speakers, a male speech recognition
model based on recorded phonemes originating from a plurality of male speakers, and a
gender-independent speech recognition model that includes independent phoneme models based on pairs of
corresponding recorded phonemes originating from the plurality of both female speakers and the
plurality of male speakers having insignificant differences in model information between the
recorded phonemes of the pair of corresponding recorded phonemes, each of the female speech
recognition model and the male speech recognition model lacking the phoneme models of the
gender-independent speech recognition model based on pairs of corresponding recorded phonemes
originating from the plurality of female speakers and the plurality of male speakers having
insignificant differences in model information between the recorded phonemes of pairs of
corresponding recorded phonemes.

18. (Currently amended) The [[method]] at least one computer readable medium of claim 17,
wherein computing the current vector probability includes estimating an a posteriori class
probability for the current feature vector.

19. (Currently amended) The [[method]] at least one computer readable medium of claim 17,
wherein computing the accumulated confidence level further comprising weighing the current 
vector probability more than the previous vector probabilities. 

20. (Currently amended) The [[method]] at least one computer readable medium of claim 17, the
method further comprising determining if another feature vector is available for analysis. 

21. (Currently amended) A system for recognizing speech data from an audio stream originating
from one of a plurality of data classes, the system comprising: 

a computer processor; 

a receiving module configured to receive a current feature vector of the audio stream; 

a first computing module configured to compute a current vector probability that the current 
feature vector belongs to one of the plurality of data classes; 

a second computing module configured to compute an accumulated confidence level that the 
audio stream belongs to one of the plurality of data classes based on the current vector probability 
and on previous vector probabilities; 

a weighing module configured to weigh class models based on the accumulated confidence;
and

a recognizing module configured to recognize the current feature vector based on the 
weighted class models; and 

wherein the plurality of data classes include a first speech recognition model based on 
recorded phonemes originating from a first set of speakers, a second speech recognition model 
based on recorded phonemes from a second set of speakers, and a third speech recognition model 
that includes phoneme models based on pairs of corresponding recorded phonemes originating from 
both the first and second set of speakers having insignificant differences in model information 
between the recorded phonemes of the pair of corresponding recorded phonemes, each of the first
speech recognition model and the second speech recognition model lacking the phoneme models of
the third speech recognition model based on pairs of corresponding recorded phonemes originating
from both the first and second set of speakers having insignificant differences in model information
between the recorded phonemes of the pairs of corresponding recorded phonemes.

22. (Original) The system of claim 21, wherein the first computing module is further configured 
to estimate an a posteriori class probability for the current feature vector. 

23. (Original) The system of claim 21, wherein the second computing module is further 
configured to weigh the current vector probability more than the previous vector probabilities. 

24. (Currently amended) A computer program product embodied in computer memory 
comprising: 

computer readable program codes coupled to the computer memory for recognizing speech 
data from an audio stream originating from one of a plurality of data classes, the computer readable 
program codes configured to cause the program to: 

receive a current feature vector of the audio stream; 

compute a current vector probability that the current feature vector belongs to one of the 
plurality of data classes; 

compute an accumulated confidence level that the audio stream belongs to one of the
plurality of data classes based on the current vector probability and on previous vector probabilities;

weigh class models based on the accumulated confidence; and

recognize the current feature vector based on the weighted class models; and

wherein the plurality of data classes include a first speech recognition model based on
recorded phonemes originating from a first set of speakers, a second speech recognition model
based on recorded phonemes from a second set of speakers, and a third speech recognition model
that includes phoneme models based on pairs of corresponding recorded phonemes originating from
both the first and second set of speakers having insignificant differences in model information
between the recorded phonemes of the pairs of corresponding recorded phonemes, each of the first
speech recognition model and the second speech recognition model lacking the phoneme models of
the third speech recognition model based on pairs of corresponding recorded phonemes originating
from both the first and second set of speakers having insignificant differences in model information
between the recorded phonemes of the pairs of corresponding recorded phonemes.

25. (Original) The computer program product of claim 24, wherein the program code configured 
to compute the current vector probability includes program code configured to determine an a 
posteriori class probability for the current feature vector. 

26. (Original) The computer program product of claim 24, wherein the program code configured 
to compute the accumulated confidence level includes program code configured to weigh the 
current vector probability more than the previous vector probabilities. 

27. (Original) The computer program product of claim 24, further comprising program code 
configured to determine if another feature vector is available for analysis. 
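
For illustration only, and not as part of any claim, the recognition loop recited in claims 17 through 27 might be sketched as follows. The sketch assumes scalar feature values in place of feature vectors, one-dimensional Gaussian class and phoneme models, equal class priors, and exponential smoothing as the way of weighing the current probability more than the previous ones. The function names, the smoothing factor `alpha`, and the sample models are hypothetical and are not taken from the application.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def class_posteriors(x, class_models):
    """A posteriori class probabilities for one feature value, assuming equal priors."""
    likelihoods = {c: gauss_pdf(x, m, v) for c, (m, v) in class_models.items()}
    total = sum(likelihoods.values())
    return {c: value / total for c, value in likelihoods.items()}


def update_confidence(previous, current, alpha=0.7):
    """Blend the current posteriors into the running confidence.

    With alpha > 0.5 the current value outweighs all earlier values combined.
    """
    if previous is None:
        return dict(current)
    return {c: alpha * current[c] + (1.0 - alpha) * previous[c] for c in current}


def recognize(x, phoneme_models, confidence):
    """Pick the phoneme with the highest confidence-weighted likelihood."""
    scores = {}
    for c, models in phoneme_models.items():
        for phoneme, (m, v) in models.items():
            scores[phoneme] = scores.get(phoneme, 0.0) + confidence[c] * gauss_pdf(x, m, v)
    return max(scores, key=scores.get)


def run(stream, class_models, phoneme_models, alpha=0.7):
    """Process feature values until none remain; return labels and final confidence."""
    confidence, labels = None, []
    for x in stream:
        posteriors = class_posteriors(x, class_models)
        confidence = update_confidence(confidence, posteriors, alpha)
        labels.append(recognize(x, phoneme_models, confidence))
    return labels, confidence


# Made-up models: the classes and phonemes sit at different points on one axis.
class_models = {"female": (2.0, 1.0), "male": (-2.0, 1.0)}
phoneme_models = {
    "female": {"aa": (1.0, 1.0), "iy": (3.0, 1.0)},
    "male": {"aa": (-3.0, 1.0), "iy": (-1.0, 1.0)},
}
labels, confidence = run([1.5, 2.5, 1.8], class_models, phoneme_models)
```

Because every value in the sample stream lies close to the "female" class model, the accumulated confidence concentrates on that class and its phoneme models dominate the weighted score.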